Abstract

Many complex systems, from power grids and the internet, to the brain and society, 
can be modeled using modular networks. Modules, densely interconnected groups 
of elements, often overlap due to elements that belong to multiple modules. The 
elements and modules of these networks perform individual and collective tasks such as 
generating and consuming electrical load, transmitting data, or executing parallelized 
computations. We study the robustness of these systems to the failure of random 
elements. We show that it is possible for the modules themselves to become isolated 
or uncoupled (non-overlapping) well before the network falls apart. When modular 
organization is critical to overall functionality, networks may be far more vulnerable 
than expected. 



Complex networks have recently attracted much interest due to their prevalence in nature 
and our daily lives [3, 21]. A critical network property is its resilience or robustness to random 
breakdown and failure [4, 11, 9, 12], typically studied as a percolation problem [27, 1, 10, 24], 
or cascading failures [16, 8, 23]. Meanwhile, most networks are modular [14, 20], comprised 
of small, densely connected groups of nodes. The modules often overlap, with elements 
belonging to multiple modules [22, 2]. Existing work on robustness has not considered the 
role of modular structure. 

Consider a system of interacting elements representing computers, power generators, neurons,
etc. These elements perform tasks sufficiently complex that they must work together
in densely interconnected modules. These tasks may be parallelized computations, protein
biosynthesis, or higher-order neurological functions such as visual processing or speech
production. Elements are required to communicate between modules, so that modules are
coupled or overlapping, and the system functions properly only when modules can communicate.
We ask how these networks respond when a random fraction of elements fail: do the
modules become uncoupled before the network loses global connectivity? Random failures 
provide a toy model of, e.g., a traumatic brain injury or degenerative disease. If enough 
elements fail, the modules can no longer communicate (higher brain functions are lost) even 
though the network may remain connected (simpler autonomic responses persist). Likewise, 
an individual module may fail if too many of its member elements cease to function. 

Modular structure can be represented as a bipartite network (Fig. 1a) [18, 19] characterized
by two degree distributions, $r_m$ and $s_n$, governing the fraction of elements that belong
to $m$ modules and the fraction of modules that contain $n$ elements, respectively. The average
number of modules per element is $\mu = \sum_m m r_m$ and the average number of elements per
module is $\nu = \sum_n n s_n$. We derive two networks from the bipartite graph by projecting
onto either the elements or the modules: One is the network between elements, while the
other is a network where each node represents a module and two modules are linked if they
share at least one element. The giant component in the element network disappears when
the network loses global connectivity; in the module network it vanishes when the modules
become uncoupled (non-overlapping). Before projection, elements fail with probability $1-p$
and are removed from the network. Meanwhile, a module is unable to complete its collective
task if fewer than a critical fraction $f_c$ of its original elements remain. These failed modules
are removed from the module network but any surviving member elements are not removed
from the element network. See Fig. 1b.
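The failure and projection rules described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names, the three-module example, and the choice to require at least one surviving member are all assumptions made here, and the example layout is hypothetical rather than a transcription of Fig. 1.

```python
from itertools import combinations
from math import ceil


def min_members(n, f_c):
    """Smallest number of surviving members that keeps a module of original
    size n functional. The small tolerance guards against floating-point
    overshoot in f_c * n; requiring at least 1 member is an assumption."""
    return max(1, ceil(f_c * n - 1e-9))


def project_after_failures(modules, alive, f_c):
    """modules: dict mapping module name -> set of member elements.
    alive: set of elements that did not fail.
    Returns (element_links, surviving_modules, module_links)."""
    # Element network: surviving members of each module stay fully linked,
    # even if the module itself no longer functions.
    element_links = set()
    for members in modules.values():
        element_links.update(combinations(sorted(members & alive), 2))

    # A module survives only if enough of its original members remain.
    surviving = {}
    for name, members in modules.items():
        left = members & alive
        if len(left) >= min_members(len(members), f_c):
            surviving[name] = left

    # Module network: surviving modules are linked if they share an element.
    module_links = {
        (a, b)
        for a, b in combinations(sorted(surviving), 2)
        if surviving[a] & surviving[b]
    }
    return element_links, surviving, module_links


# Hypothetical example: three modules in a chain, element 3 fails.
modules = {"A": {1, 2}, "B": {2, 3, 4}, "C": {4, 5, 6}}
alive = {1, 2, 4, 5, 6}
element_links, surviving, module_links = project_after_failures(
    modules, alive, f_c=0.7
)
```

In this example module B drops below its threshold, so modules A and C no longer overlap, while the link between elements 2 and 4 keeps the element network in one piece.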

We wish to determine S(p), the fraction of remaining nodes within the giant component 
as a function of p, for both the element and module networks. We define four generating 
functions [18, 19]:

$$f_0(z) = \sum_{m=0}^{\infty} r_m z^m, \quad f_1(z) = \frac{1}{\mu} \sum_{m=0}^{\infty} m r_m z^{m-1}, \quad g_0(z) = \sum_{n=0}^{\infty} s_n z^n, \quad g_1(z) = \frac{1}{\nu} \sum_{n=0}^{\infty} n s_n z^{n-1}, \qquad (1)$$

which generate the probabilities for ($f_0$) a random element to belong
to $m$ modules, ($f_1$) a random element within a randomly chosen module to belong to $m$
other modules, ($g_0$) a random module to contain $n$ elements, and ($g_1$) a random module of
a randomly chosen element to contain $n$ other elements.

Figure 1: The modular network representation [18, 19]. (a) We obtain two networks by projecting
onto elements or modules. (b) The failure of element 3 induces the failure of module
B, uncoupling the remaining modules, even though the network itself remains connected.



1 Element network 

Consider a randomly chosen element A that belongs to a group of size $n$. Let $P(k|n)$ be the
probability that A still belongs to a connected cluster of $k$ nodes (including itself) in this
group after failures occur:

$$P(k|n) = \binom{n-1}{k-1} p^{k-1} (1-p)^{n-k}. \qquad (2)$$

The generating function for the number of other elements connected to A within this group 
is 

$$h_n(z) = \sum_{k=1}^{n} P(k|n) z^{k-1} = (zp + 1 - p)^{n-1}. \qquad (3)$$

Averaging over module size:

$$h(z) = \frac{1}{\nu} \sum_{n=0}^{\infty} n s_n h_n(z) = g_1(zp + 1 - p). \qquad (4)$$

The total number of elements that A is connected to, from all modules it belongs to, is then 
generated by 

$$G_0(z) = f_0(h(z)). \qquad (5)$$

Likewise, the total number of elements that a randomly chosen neighbor of A is connected
to is generated by

$$G_1(z) = f_1(h(z)). \qquad (6)$$

Before determining $S$, we first identify the critical point $p_c$ where the giant component
emerges. This happens when the expected number of elements two steps away from a random
element exceeds the number one step away, or

$$\partial_z G_0(G_1(z))\big|_{z=1} - \partial_z G_0(z)\big|_{z=1} > 0. \qquad (7)$$

Substituting Eqs. (5) and (6) gives $f_0'(1) h'(1) [f_1'(1) h'(1) - 1] > 0$, or $f_1'(1) h'(1) > 1$. Finally,
the condition for a giant component to exist, since $h'(1) = p g_1'(1)$, is

$$p f_1'(1) g_1'(1) > 1. \qquad (8)$$

For the uniform case, $r_m = \delta(m, \mu)$ and $s_n = \delta(n, \nu)$, this gives $p(\mu - 1)(\nu - 1) > 1$. If $\mu = 3$
and $\nu = 3$, then the transition occurs at $p_c = 1/4$.
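The threshold implied by Eq. (8) can be checked numerically for any pair of distributions. The sketch below assumes the standard forms $f_1'(1) = \sum_m m(m-1) r_m / \mu$ and $g_1'(1) = \sum_n n(n-1) s_n / \nu$; the function names and the dictionary representation of $r_m$ and $s_n$ are choices made here for illustration.

```python
def excess_mean(dist):
    """Given a distribution as {value: probability}, return
    sum k(k-1) d_k / sum k d_k, which equals f_1'(1) when applied to r_m
    and g_1'(1) when applied to s_n."""
    mean = sum(k * w for k, w in dist.items())
    return sum(k * (k - 1) * w for k, w in dist.items()) / mean


def element_pc(r, s):
    """Critical occupation probability of the element network from
    p * f_1'(1) * g_1'(1) = 1."""
    return 1.0 / (excess_mean(r) * excess_mean(s))


# Uniform case: every element in mu = 3 modules, every module of size nu = 3.
pc_uniform = element_pc({3: 1.0}, {3: 1.0})
# Same memberships with larger modules, nu = 6.
pc_larger_modules = element_pc({3: 1.0}, {6: 1.0})
```

For $\mu = \nu = 3$ this evaluates to $1/[(3-1)(3-1)] = 1/4$, matching the value quoted above; with $\nu = 6$ the same formula gives $1/10$.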

To find $S$, consider the probability $u$ for element A to not belong to the giant component.
A is not a member of the giant component only if all of A's neighbors are also not members,
so $u$ satisfies the self-consistency condition $u = G_1(u)$. The size of the giant component is
then $S = 1 - G_0(u)$.
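The self-consistency condition has no closed form in general, but it can be solved by fixed-point iteration. The following is a minimal sketch under stated assumptions: the generating functions take the standard forms $f_0(z) = \sum_m r_m z^m$, $f_1(z) = \sum_m m r_m z^{m-1}/\mu$ and $g_1(z) = \sum_n n s_n z^{n-1}/\nu$, the iteration starts from $u = 0$ so that it approaches the smallest fixed point, and all function names are illustrative.

```python
def gen0(dist, z):
    """Generating function sum_k d_k z^k (f_0 for r_m, g_0 for s_n)."""
    return sum(w * z**k for k, w in dist.items())


def gen1(dist, z):
    """Size-biased generating function sum_k k d_k z^(k-1) / sum_k k d_k
    (f_1 for r_m, g_1 for s_n)."""
    mean = sum(k * w for k, w in dist.items())
    return sum(k * w * z ** (k - 1) for k, w in dist.items()) / mean


def giant_component(r, s, p, tol=1e-12, max_iter=100000):
    """Solve u = G_1(u) = f_1(h(u)) with h(z) = g_1(z p + 1 - p),
    then return S = 1 - G_0(u) = 1 - f_0(h(u))."""

    def h(z):
        return gen1(s, z * p + 1 - p)

    u = 0.0
    for _ in range(max_iter):
        u_next = gen1(r, h(u))
        if abs(u_next - u) < tol:
            break
        u = u_next
    return 1.0 - gen0(r, h(u))


uniform_r = {3: 1.0}
uniform_s = {3: 1.0}
S_below = giant_component(uniform_r, uniform_s, p=0.2)  # below p_c = 1/4
S_above = giant_component(uniform_r, uniform_s, p=0.5)  # above p_c = 1/4
S_full = giant_component(uniform_r, uniform_s, p=1.0)   # no failures
```

Convergence slows considerably close to $p_c$, so values of $p$ near the threshold may need a larger `max_iter` or a different root finder.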



2 Module network 

Consider a random module C and then a random member element A. Let $Q(\ell|m)$ be the
probability that C is connected to $\ell$ modules, including itself, through element A, which was
originally connected to $m$ modules including C:

$$Q(\ell|m) = \binom{m-1}{\ell-1} q_1^{\ell-1} (1 - q_1)^{m-\ell}, \qquad (9)$$

where

$$q_1 = \frac{1}{\nu} \sum_{n=0}^{\infty} n s_n \sum_{i=x}^{n} \binom{n-1}{i-1} p^{i-1} (1-p)^{n-i}. \qquad (10)$$

(Notice that $q_1 = 1$ when $x(n) = \lceil n f_c \rceil = 1$ for all $n$.) The generating function $j_m$ for the
number of modules that C is connected to, including itself, through A is

$$j_m(z) = \sum_{\ell=1}^{m} Q(\ell|m) z^{\ell-1} = (z q_1 + 1 - q_1)^{m-1}. \qquad (11)$$

Once again, averaging $j_m$ over memberships gives

$$j(z) = \frac{1}{\mu} \sum_{m=0}^{\infty} m r_m j_m(z) = f_1(z q_1 + 1 - q_1). \qquad (12)$$
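
The total number of modules that C is connected to is not generated by $g_0(j(z))$ but by
$\tilde{g}_0(j(z))$, where the $\tilde{g}_i$ are the generating functions for module size after elements fail:

$$\tilde{g}_0(z) = \sum_k \tilde{s}_k z^k, \qquad \tilde{g}_1(z) = \frac{\sum_k k \tilde{s}_k z^{k-1}}{\sum_k k \tilde{s}_k}. \qquad (13)$$

The probability to have $k$ member elements remaining in a module after percolation is
given by

$$\tilde{s}_k = \frac{\sum_n s_n \binom{n}{k} p^k (1-p)^{n-k} \, [k \geq \lceil n f_c \rceil]}{\sum_{k'} \sum_n s_n \binom{n}{k'} p^{k'} (1-p)^{n-k'} \, [k' \geq \lceil n f_c \rceil]}, \qquad (14)$$

where $[\cdot]$ equals 1 if its argument is true and 0 otherwise.
The denominator is necessary for normalization since we cannot observe modules with fewer
than $\lceil n f_c \rceil$ members. Notice that $\tilde{s}_n = s_n$ when $s_n = \delta(n, \nu)$ and $\lceil n f_c \rceil = n = \nu$.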

Finally, the total number of modules connected to C through any member elements is
generated by $F_0(z) = \tilde{g}_0(j(z))$ and the total number of modules connected to a random
neighbor of C is generated by $F_1(z) = \tilde{g}_1(j(z))$. As before, the module network has a giant
component when $\partial_z F_0(F_1(z))|_{z=1} - \partial_z F_0(z)|_{z=1} > 0$ and $S = 1 - F_0(u) = 1 - \tilde{g}_0(j(u))$, where
$u$ satisfies $u = F_1(u) = \tilde{g}_1(j(u))$.

For the uniform case with $\mu = 3$, $\nu = 3$, and $f_c > 2/3$, the critical point for the module
network is $p_c = 1/2$, a considerably higher threshold than for the element network ($p_c = 1/4$).
In Fig. 2 we show $S$ for $\mu = 3$ and $\nu = 6$. The "robustness gap" between the element and
module networks widens as the module failure cutoff increases, covering a significant range
of $p$ for the larger values of $f_c$.
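The module threshold can also be located numerically. The sketch below is an illustration under several assumptions rather than the authors' procedure: it takes $q_1 = \frac{1}{\nu}\sum_n n s_n \sum_{i \geq x} \binom{n-1}{i-1} p^{i-1}(1-p)^{n-i}$ with $x = \lceil n f_c \rceil$, takes $\tilde{s}_k$ to be the binomially thinned module sizes restricted to $k \geq \lceil n f_c \rceil$ and renormalized, reduces the giant-component condition to $F_1'(1) = \tilde{g}_1'(1)\, q_1\, f_1'(1) > 1$, and assumes this quantity increases with $p$ so that bisection applies. All function names are hypothetical.

```python
from math import ceil, comb


def x_min(n, f_c):
    """Minimum surviving members for a module of original size n
    (tolerance guards against floating-point overshoot in f_c * n)."""
    return max(1, ceil(f_c * n - 1e-9))


def q1(s, p, f_c):
    """Probability that a module reached through a surviving element
    still has enough members to function."""
    nu = sum(n * w for n, w in s.items())
    total = 0.0
    for n, w in s.items():
        tail = sum(
            comb(n - 1, i - 1) * p ** (i - 1) * (1 - p) ** (n - i)
            for i in range(x_min(n, f_c), n + 1)
        )
        total += n * w * tail
    return total / nu


def surviving_sizes(s, p, f_c):
    """Distribution of remaining member counts among functioning modules."""
    raw = {}
    for n, w in s.items():
        for k in range(x_min(n, f_c), n + 1):
            raw[k] = raw.get(k, 0.0) + w * comb(n, k) * p**k * (1 - p) ** (n - k)
    norm = sum(raw.values())
    return {k: v / norm for k, v in raw.items()} if norm > 0 else {}


def module_branching(r, s, p, f_c):
    """F_1'(1); the module network has a giant component when this exceeds 1."""
    st = surviving_sizes(s, p, f_c)
    mean_k = sum(k * w for k, w in st.items())
    if mean_k == 0:
        return 0.0
    g1_prime = sum(k * (k - 1) * w for k, w in st.items()) / mean_k
    mu = sum(m * w for m, w in r.items())
    f1_prime = sum(m * (m - 1) * w for m, w in r.items()) / mu
    return g1_prime * q1(s, p, f_c) * f1_prime


def module_pc(r, s, f_c, iters=60):
    """Bisection for the p at which module_branching crosses 1."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if module_branching(r, s, mid, f_c) > 1:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


pc_modules = module_pc({3: 1.0}, {3: 1.0}, f_c=0.7)
```

For $\mu = \nu = 3$ and $f_c = 0.7$ the condition reduces to $4p^2 > 1$, so the bisection should land on the $p_c = 1/2$ quoted above.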

Of particular interest are scale-free networks [7, 28, 21]. Here we take $r_m = \delta(m, \mu)$ as
before, but now $s_n \sim n^{-\lambda}$, with $\lambda > 2$.$^{1}$ It is known that scale-free networks are robust to
random failures when $2 < \lambda < 3$ (meaning that $p_c \to 0$). However, this result also requires
that the maximum value $K$ of the degree distribution be large ($K \gg 1$) [11]. Indeed, as
we lower $\lambda$, we discover that, while we increase the robustness of the elements, we actually
decrease the robustness of the modules (Fig. 3). For modular networks, it may not be
feasible to build extremely large modules. Interestingly, enforcing on $s_n$ a maximum module
size cutoff $N = \max\{n \mid s_n > 0\}$ only improves element robustness.
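The dependence of the element threshold on the exponent and the cutoff can be explored with a truncated power law. This sketch makes assumptions that are not stated in the text: module sizes run from a hypothetical minimum of 2 up to the cutoff, $s_n$ is exactly proportional to $n^{-\lambda}$ on that range, and the element threshold is $p_c = 1/[f_1'(1) g_1'(1)]$. The parameter values are arbitrary illustrations.

```python
def truncated_power_law(lam, n_max, n_min=2):
    """Module size distribution s_n proportional to n^(-lam) for
    n_min <= n <= n_max (n_min = 2 is an assumption made here)."""
    weights = {n: n ** (-lam) for n in range(n_min, n_max + 1)}
    total = sum(weights.values())
    return {n: w / total for n, w in weights.items()}


def excess_mean(dist):
    """sum k(k-1) d_k / sum k d_k, i.e. f_1'(1) or g_1'(1)."""
    mean = sum(k * w for k, w in dist.items())
    return sum(k * (k - 1) * w for k, w in dist.items()) / mean


def element_pc(r, s):
    """Element-network threshold from p f_1'(1) g_1'(1) = 1."""
    return 1.0 / (excess_mean(r) * excess_mean(s))


r = {3: 1.0}  # every element belongs to 3 modules
pc_by_setting = {
    (lam, n_max): element_pc(r, truncated_power_law(lam, n_max))
    for lam in (2.2, 3.0)
    for n_max in (20, 200)
}
```

Under these assumptions the element threshold drops both when the exponent is lowered and when the cutoff is raised; the corresponding module threshold requires the thinned size distribution and is not computed here.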



3 Empirical results 

We study failures in multiple social, biological, and informational real-world datasets (see 
App. A). Unlike the model, we do not know the modules in advance, so we estimate them with 
an overlapping community algorithm [2] (a second method [22] displays similar behavior). 
These networks tend to be smaller than those previously discussed, introducing finite-size 
effects that mask the behavior of S. To overcome this, we instead use S', the fraction of 
original nodes that remain in the giant component (see App. B). As shown in Fig. 4, the
modules fall apart more easily than the elements, qualitatively matching our model across a
broad range of networks.

1 The degree distribution after projection remains scale-free (with the same exponent), although the
maximum degree may increase.

Figure 2: The size of the giant component $S$ for $r_m = \delta(m, \mu)$, $s_n = \delta(n, \nu)$, with $\mu = 3$ and
$\nu = 6$. Theory and simulations confirm that the network undergoes a transition from coupled
to non-overlapping modules well before it loses global connectivity. Symbols represent the
element (□) and module (○) networks.

4 Conclusions 

There are a number of interesting avenues for further work. We considered the simplest 
case of random failures but extensions to purposeful attacks (failure proportional to n or 
m) are also important. Likewise, the model we use assumes that all links exist within 
modules, but links between modules are certainly possible. These additional links can only 
enhance the robustness of the element network, but will not improve the module network, 
so that the robustness gap may be significantly increased. Beyond structural characteristics 
of these modular networks it is important to understand the effect of failures and modular 
structure on critical phenomena such as synchronization [6, 5], contact processes [15, 26], 
cascades [16, 8, 23] or other dynamics [13]. 

Finally, this work can also help us to understand how empirical networks are affected by 
missing data, of critical importance when studying communities. Here p is the probability 
that a network element is successfully captured by an experiment, such as a high-throughput 
biological assay or web crawler. The robustness gap can explain how non-overlapping community
methods may succeed in networks where overlap is expected: the network is sampled
down to the intermediate regime where nodes are connected but modules are uncoupled. 

Figure 3: Robustness of scale-free networks. Here $r_m = \delta(m, 3)$, $s_n \sim n^{-\lambda}$, $f_c = 1/2$,
and $N = \max\{n \mid s_n > 0\}$. Increasing $N$ and decreasing $\lambda$, measures known to improve
the robustness of scale-free networks, actually magnify the robustness gap. Surprisingly,
this also increases the fragility of the module network, indicating that optimizing against
structural failure may worsen the network's functional resilience.

Acknowledgments 

We thank H. Rozenfeld, F. Simini, Y.-R. Lin, D. Wang, D. ben-Avraham, and A.-L. Barabási
for many useful discussions, and Hartwig Siebner and Kristoffer Madsen at Hvidovre Hospital's
Danish Research Centre for Magnetic Resonance for normal-patient fMRI data. The
authors acknowledge the Center for Complex Network Research, supported by the James S. 
McDonnell Foundation, the NSF, NIH, US ONR and ARL, DTRA, and NKTH NAP. 

A Datasets 

In this work, we study six empirical networks. The Word Association, Metabolic, and 
Protein-Protein Interaction (all) networks were previously used in [2]; details are available
there. The Web Links network is constructed from a web crawl made available by Google;
see http://google.com/programming-contest/. The Collaborations network is constructed
between authors who share at least one publication on the arXiv:cond-mat system [17]. The
Brain network was derived using normal patient fMRI data where each node is a "voxel"
dividing the brain spatially and links exist between voxels whose respective BOLD time
series are correlated (measured using Normalized Mutual Information). We begin with the
top 200k most correlated links. A single voxel had very high degree, $k = 0.73N$ (the next
highest degree is $k = 0.096N$), so we first remove it. This leaves 5038 nodes and 196311
links. We further preprocess this dense network by extracting its multiscale backbone [25]
($\alpha = 0.37$), giving a final network of 5038 nodes and 77680 links. For all networks, link
communities were extracted at the level of maximum partition density [2], providing the
estimated modules.

Figure 4: (color online). We simulate failures in a number of real networks, from functional
brain networks to WWW hyperlinks and collaborative social networks. Many of these networks
are robust to random failures (the element networks exhibit very small $p_c$). The behavior
of the giant component for all the empirical networks qualitatively
matches that of the model, as the identified modules uncouple faster than the network itself.
Shaded regions provide a guide to the eye for the robustness gap ($f_c = 0.7$). For full dataset
details, see App. A.



B Finite-size effects 

For the empirical networks analyzed in Fig. 4, we modified our definition of the quantity S 
due to finite-size effects. There are two sources for these effects: (i) the number of modules is 
often much smaller than the number of elements, so that a small network of a few thousand 
elements may only have a few hundred modules; and (ii) the rate at which elements fail 
may be slower than the rate at which modules fail (the former is simply given by p but 
the latter also depends on $s_n$ and $f_c$). We suppress these effects by choosing $S$ with a
well-behaved denominator as $p \to 0$. Specifically, our options are $S(p) = N_{\mathrm{gcc}}(p)/N(p)$ or
$S'(p) = N_{\mathrm{gcc}}(p)/N(1)$, where $N_{\mathrm{gcc}}(p)$ is the number of nodes (either elements or modules) within
the largest component at percolation probability $p$ and $N(p)$ is the total number of nodes at
percolation probability $p$. The quantity $S'$ has better behavior under the above conditions,
although the transition appears less dramatic than it does for $S$. In Fig. 5 we present the
same data as in Fig. 4 using the original definition of $S$.

Figure 5: (color online). For the empirical datasets in Fig. 4, we present here the same data
for the original definition of $S$, the fraction of remaining nodes within the giant component.
For some of the networks the transition points are more dramatic in this representation;
however, for many it is difficult to determine their location due to strong finite-size effects.
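The two normalizations differ only in their denominators, which a short sketch makes concrete. It assumes the network is given as a set of nodes and a list of links, uses a simple union-find to find the largest component, and computes $S = N_{\mathrm{gcc}}(p)/N(p)$ alongside $S' = N_{\mathrm{gcc}}(p)/N(1)$; the function names and the toy network are illustrative only.

```python
def largest_component(nodes, links):
    """Size of the largest connected component among `nodes`, ignoring
    any link that touches a node outside `nodes`."""
    parent = {v: v for v in nodes}

    def find(v):
        # Path-halving union-find lookup.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for a, b in links:
        if a in parent and b in parent:
            parent[find(a)] = find(b)

    sizes = {}
    for v in nodes:
        root = find(v)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values(), default=0)


def giant_fractions(original_nodes, remaining_nodes, links):
    """Return (S, S'): the largest component as a fraction of the remaining
    nodes and as a fraction of the original nodes."""
    n_gcc = largest_component(remaining_nodes, links)
    s = n_gcc / len(remaining_nodes) if remaining_nodes else 0.0
    s_prime = n_gcc / len(original_nodes)
    return s, s_prime


# Toy chain 1-2-3-4-5 plus an isolated node 6; node 3 has failed.
original = {1, 2, 3, 4, 5, 6}
remaining = {1, 2, 4, 5, 6}
links = [(1, 2), (2, 3), (3, 4), (4, 5)]
s, s_prime = giant_fractions(original, remaining, links)
```

Because the denominator of $S'$ is fixed at the original network size, it stays well defined even when very few nodes remain, which is the situation the finite-size discussion above is concerned with.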